Binary versus Real-valued Reward Functions under Coevolutionary Reinforcement Learning
Authors
Abstract
Models of coevolution supporting competitive and cooperative behaviors can be used to decompose a problem while scaling to large environmental state spaces. This work examines the significance of several design decisions that affect the deployment of a distinction-based formulation of competitive coevolution. Specifically, competitive coevolutionary formulations with and without point-population speciation are compared to stochastic sampling of the environment under both binary and real-valued rewards. The additional structure implicit in the competitive coevolutionary models is shown to be of significant benefit under binary rewards; under real-valued feedback, however, stochastic sampling yields more dependable performance. It is also observed that cooperation between multiple solutions is far more prevalent under real-valued rewards than under binary rewards.
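The contrast between binary (pass/fail) and real-valued (graded) feedback can be illustrated with a deliberately simplified sketch. Everything here is an illustrative assumption, not the paper's actual method: the target function y = 3x, the pass/fail tolerance, and the (1+1) hill climber standing in for a full coevolutionary algorithm are all hypothetical choices made only to show why graded rewards give the search a gradient while binary rewards produce flat plateaus.

```python
import random

random.seed(0)  # deterministic toy run

# A "learner" is a single slope w approximating the target y = 3x on sampled points.
def predict(w, x):
    return w * x

def binary_reward(w, x, tol=0.5):
    # pass/fail feedback: 1 if the prediction is within tol of the target, else 0
    return 1.0 if abs(predict(w, x) - 3.0 * x) <= tol else 0.0

def real_reward(w, x):
    # graded feedback: higher reward for smaller prediction error
    return -abs(predict(w, x) - 3.0 * x)

def evolve(reward_fn, points, generations=200):
    # (1+1)-style hill climber as a stand-in for a full coevolutionary EA
    w = 0.0
    fit = sum(reward_fn(w, x) for x in points)
    for _ in range(generations):
        cand = w + random.gauss(0.0, 0.5)
        cand_fit = sum(reward_fn(cand, x) for x in points)
        if cand_fit >= fit:  # ties accepted: binary reward then drifts on plateaus
            w, fit = cand, cand_fit
    return w

points = [random.uniform(-1.0, 1.0) for _ in range(20)]
w_bin = evolve(binary_reward, points)
w_real = evolve(real_reward, points)
```

Under the graded reward every mutation that shrinks |w - 3| is strictly better, so the climber converges toward the target slope; under the binary reward many distinct slopes pass the same test points, so fitness is piecewise constant and the search wanders, which loosely mirrors the abstract's observation that extra structure (distinctions between candidates) matters most when feedback is binary.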
Similar resources
Maximal width learning of binary functions
This paper concerns learning binary-valued functions defined on ℝ, and investigates how a particular type of 'regularity' of hypotheses can be used to obtain better generalization error bounds. We derive error bounds that depend on the sample width (a notion similar to that of sample margin for real-valued functions). This motivates learning algorithms that seek to maximize sample width.
Gradient-Based Learning Updates Improve XCS Performance in Multistep Problems
This paper introduces a gradient-based reward prediction update mechanism to the XCS classifier system, as applied in neural-network-type learning and function approximation mechanisms. A strong relation of XCS to tabular reinforcement learning and, more importantly, to neural-based reinforcement learning techniques is drawn. The resulting gradient-based XCS system learns more stable and reliable i...
Associative Reinforcement Learning of Real-valued Functions
Associative reinforcement learning (ARL) tasks, defined originally by Barto and Anandan [1], combine elements of problems involving optimization under uncertainty, studied by learning automata theorists, and supervised learning pattern-classification. The stochastic real-valued (SRV) unit algorithm [6] has been designed for an extended version of ARL tasks wherein the learning system's outputs can...
Dynamic Obstacle Avoidance by Distributed Algorithm based on Reinforcement Learning (RESEARCH NOTE)
In this paper we focus on the application of reinforcement learning to obstacle avoidance in dynamic environments in wireless sensor networks. A distributed algorithm based on reinforcement learning is developed for sensor networks to guide a mobile robot through the dynamic obstacles. The sensor network models the danger of the area under coverage as obstacles, and has the property of adoption o...
A Policy Search Method For Temporal Logic Specified Reinforcement Learning Tasks
Reward engineering is an important aspect of reinforcement learning. Whether or not the users' intentions can be correctly encapsulated in the reward function can significantly impact the learning outcome. Current methods rely on manually crafted reward functions that often require parameter tuning to obtain the desired behavior. This operation can be expensive when exploration requires system...
Publication date: 2009